1. Introduction

In the past few decades, the spectral properties of regular graphs have attracted
considerable attention from researchers in diverse disciplines such as combinatorics,
information theory, theoretical and applied computer science, quantum chaos and
spectral theory (to list only a few). In order to better understand the eigenvectors
of the Laplacian on such graphs, we try to establish some analogies between those
eigenvectors and the eigenfunctions of chaotic manifolds. The tools we use for this task
are mostly probabilistic.
In the following, we consider the statistical properties of $G(n, d) = (V, E)$, a graph
which is chosen uniformly at random from the set of $d$-regular graphs on $n$ vertices.
The graph can be uniquely described by its adjacency matrix $A$ (also known as the
connectivity matrix), where $A_{ij} = 1$ if $v_i$ and $v_j$ are adjacent vertices in $G$, and zero
otherwise. The action of the discrete Laplacian on a function $f : V \to \mathbb{R}$ is

$(Lf)_i = \sum_{j \sim i} (f_i - f_j)$    (1)

where $f_i = f(v_i)$, and the summation is over all vertices $v_j$ which are adjacent to $v_i$. For
regular graphs, the Laplacian can be expressed in matrix form as $L = dI - A$; therefore
an eigenvector of the adjacency matrix with eigenvalue $\lambda$ is also an eigenvector of
the Laplacian, with eigenvalue $\mu = d - \lambda$.
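
The relation between the two operators can be checked on a small example. The following is a minimal sketch in Python (not the code used in this work), assuming the graph is stored as a list of neighbour lists; the function names and the choice of the complete graph on four vertices (a 3-regular graph) are illustrative assumptions.

```python
def adjacency_apply(neighbors, f):
    """Return (A f)_i, the sum of f over the neighbours of vertex i."""
    return [sum(f[j] for j in neighbors[i]) for i in range(len(f))]


def laplacian_apply(neighbors, f):
    """Return (L f)_i, the sum of (f_i - f_j) over the neighbours j of vertex i."""
    return [sum(f[i] - f[j] for j in neighbors[i]) for i in range(len(f))]


# Complete graph on 4 vertices: every vertex is adjacent to the other three (d = 3).
d = 3
neighbors = [[j for j in range(4) if j != i] for i in range(4)]

# f is an adjacency eigenvector with eigenvalue lambda = -1.
f = [1.0, -1.0, 0.0, 0.0]
Af = adjacency_apply(neighbors, f)   # equals -1 * f
Lf = laplacian_apply(neighbors, f)   # equals (d - (-1)) * f = 4 * f
```

Here the same vector is an eigenvector of both operators, and the two eigenvalues are related by $\mu = d - \lambda = 4$.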

The eigenvalues and eigenvectors of $A$ contain valuable information about the structure
of the graph. The relations between the spectrum of the adjacency matrix and the
expansion properties of $G$ (see section 2.2) have been thoroughly investigated and
were found useful for a variety of tasks. To mention some, the study of
expanders is related to the evaluation of convergence rates for Markov chains and to the study
of metric embeddings in mathematics. In computer science, one uses expanders for the
analysis of communication networks, the construction of efficient error-correcting codes, and
the theory of pseudorandomness (for a detailed survey, see [3]). The eigenvectors of $A$
are used successfully in various algorithms, such as partitioning and clustering
(e.g. [4, 5, 6]).

Just as one can learn about the structure of the graph from its spectral properties, one can
also go the other way around. In the study of quantum properties of (classically) chaotic systems,
one is commonly interested in statistical properties of the spectrum and eigenstates of
the corresponding Schrödinger operator. While quantum operators on graphs (such as
the Laplacian) are easy to define, the classical analogue is not obvious. A plausible
classical extension is to consider a random walk on the graph. For a connected
graph which is not bipartite, it is known that random walks mix fast (e.g. [7]).
Since a fast mixing system is chaotic, one might expect that the quantum properties
of a generic graph will be related in some manner to those of chaotic systems. This
conjecture was supported by numerical simulations [8], and recently found an explicit
formulation [9], relating spectral properties to cycles in $G$.

The main goal of the current paper is the characterization of statistical properties of
eigenvectors of $(n, d)$ graphs. As this work is inspired by analogous findings for chaotic
wave functions, and makes extensive use of several combinatorial properties of random regular
graphs, we dedicate the next section to a review of some relevant results concerning chaotic
billiards and $(n, d)$ graphs.
In section 3 we examine correlations of the eigenvectors at different vertices. We
derive an explicit limiting expression for the (short distance) empirical covariance, which
depends only on the eigenvalue $\lambda$.
In section 4 we provide numerical evidence which suggests that the distribution of the
eigenfunctions' components can be approximated by a Gaussian measure.
Assuming a Gaussian measure, we dedicate section 5 to the evaluation of some expected
properties of the nodal pattern of the eigenvectors, such as the expected number of nodal
domains and their expected structure.

2. A brief review of previous results 

2.1. Eigenvectors of chaotic billiards 

A classical billiard system is defined as a point particle which is confined to a domain
$\mathcal{D} \subset \mathbb{R}^n$. The particle moves with a constant speed along geodesics and collides
specularly with the boundary of $\mathcal{D}$. Depending on the shape of the boundary, the
dynamics of the particle can be classified as chaotic or regular. A quantum analogue
is to consider the eigenstates of the Schrödinger operator for a particle confined to
$\mathcal{D}$:

$-\Delta_{\mathcal{D}} \psi(r) = E \psi(r)$    (2)

where $\Delta_{\mathcal{D}}$ is the Laplace-Beltrami operator, restricted to $\mathcal{D}$, with Dirichlet boundary conditions.
The statistics of the wave function $\psi(r)$ rely on the classical properties of $\mathcal{D}$. In [10], a
limiting expression for the autocorrelation function $C(r_0, r, k)$, defined in (3),
was calculated, where $\langle \ldots \rangle$ denotes averaging over an appropriate spectral window
$[k, k + \epsilon_k]$ in the semiclassical limit $k \to \infty$ ‡. It was shown that for an integrable
domain, $C(r_0, r, k)$ is anisotropic and depends on the symmetries of the domain. For
chaotic domains, the limit of the autocorrelation function is isotropic and universal (for
points which are far enough from the boundary [11]), and can be written explicitly as

$\lim_{k \to \infty} C(r, k) = \Gamma(n/2) \, \frac{J_{n/2-1}(|kr|)}{(|kr|/2)^{n/2-1}}$    (4)

where $J_\nu(x)$ is the $\nu$th Bessel function of the first kind, which decays asymptotically as
$J_\nu(x) \sim \cos(x + \varphi_\nu)/\sqrt{x}$. Moreover, it was suggested that in the semiclassical limit
the eigenvector statistics reproduce a Gaussian measure (the random wave model), i.e.
for $\{r_1, r_2, \ldots, r_m\} \subset \mathcal{D}$, the probability density of $\psi(R) = (\psi(r_1), \ldots, \psi(r_m))^T$, for a wave
function chosen uniformly from the spectral window $[k, k + \epsilon_k]$, converges to

$p(\psi(R)) = \frac{1}{\sqrt{(2\pi)^m |C_k|}} \exp\left[ -\frac{1}{2} \psi(R)^T C_k^{-1} \psi(R) \right]$    (5)

where the covariance matrix is given by $(C_k)_{ij} = \lim_{k \to \infty} C(r_i - r_j, k)$. Although the
random wave model is not supported by any rigorous derivation, it was found consistent
with some numerical observations, such as [12, 13].

A different characterization of the eigenvectors, based on their nodal pattern, was
suggested in [12] for two-dimensional manifolds: since it is always possible to find a basis
in which the eigenfunctions of $-\Delta$ are real, they can be divided into nodal domains,
connected regions of the same sign, separated by nodal lines on which the eigenfunction
vanishes. By Courant's theorem [14], the $j$th eigenstate contains no more than $j$ nodal
domains. The authors investigated the limiting distribution (as $j \to \infty$) of the
parameter $\xi_j = \nu_j / j$, where $\nu_j$ is the number of nodal domains in the $j$th wave function.
They derived an explicit expression for separable domains, which depends on the
explicit structure of the domain. For chaotic billiards, they observed a universal
limiting distribution, independent of the investigated domain.

‡ As the density of states scales as $k^{n-1}$, we demand that $\epsilon_k \to 0$ but $\epsilon_k k^{n-1} \to \infty$, so that the
energy does not vary significantly along the window, and the number of states is large enough.

This limiting distribution found an intriguing explanation in [2], where the nodal pattern
is described in terms of a critical bond percolation model. While for some measures on
the nodal lines [15, 16] the correspondence is not complete, general arguments such as
[17] imply that the scaling limits of the two models should coincide. In addition,
the model predicts with great accuracy diverse properties of the nodal pattern and
the nodal lines [2, 18, 19, 13].

2.2. Some properties of large regular graphs 

Throughout this paper, we will focus our attention on $(n, d)$ graphs, where $d \ge 3$ is fixed
and $n \to \infty$. With high probability, those graphs are highly connected, or expanding.
An expander graph $G = (V, E)$ has the property that for every (small enough) subset
$S \subset V$, the edge boundary $\partial S$, which is the set of edges connecting $S$ to $G \setminus S$, is
proportional in size to $S$ itself.

A related property of $(n, d)$ graphs, which will be used repeatedly in the following, is
the local tree property. It is known [20] that for $k < \log_{d-1}(n)$, the numbers $C_k$
of cycles of length $k$ in an $(n, d)$ graph are distributed asymptotically as independent
Poisson random variables with mean $E(C_k) = (d-1)^k / 2k$. Therefore, for any $\epsilon > 0$
and as $n \to \infty$, almost all of the vertices of an $(n, d)$ graph are not contained in a cycle
of length shorter than $(1 - \epsilon) \log_{d-1}(n)$, with high probability. Equivalently, the ball
of radius $\log_{d-1}(n)/(2 + \epsilon)$ around almost all of the vertices is a tree. The volume of
a ball of radius $k$ in an $(n, d)$ graph grows exponentially with $k$ for $k < \log_{d-1}(n)$. In
fact, the diameter of $G$ may differ from $\log_{d-1}(n \log n)$ only by a (small) finite number
independent of $n$ [21]. In the following we will express logarithms in the natural tree
base: $\log(x) = \log_{d-1}(x)$.
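
To get a feeling for the scales involved, the quantities above can be evaluated directly. The following Python sketch is only an illustration; the function names are assumptions introduced here, and the ball volume is computed for the infinite $d$-regular tree, which is the local approximation discussed above.

```python
import math


def log_tree(x, d):
    """Logarithm in the 'natural tree base' d - 1."""
    return math.log(x) / math.log(d - 1)


def expected_cycles(k, d):
    """Asymptotic Poisson mean (d - 1)^k / (2k) of the number of k-cycles."""
    return (d - 1) ** k / (2 * k)


def tree_ball_volume(radius, d):
    """Number of vertices within the given distance of a vertex of the d-regular tree."""
    return 1 + sum(d * (d - 1) ** (j - 1) for j in range(1, radius + 1))


# Example values for cubic graphs on 4000 vertices.
scale = log_tree(4000, 3)          # about 11.97
triangles = expected_cycles(3, 3)  # 8 / 6
ball = tree_ball_volume(2, 3)      # 1 + 3 + 6
```

For $d = 3$ the expected number of triangles stays of order one as $n$ grows, which is why a typical vertex does not see any short cycle.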

The adjacency matrix of a graph is real and symmetric, therefore it has a real spectrum,
which is supported on $[-d, d]$. As $n \to \infty$, the spectral measure of $A$ converges to the
Kesten-McKay measure [22]:

$\rho(\lambda) = \frac{d \sqrt{4(d-1) - \lambda^2}}{2\pi (d^2 - \lambda^2)}$ for $|\lambda| \le 2\sqrt{d-1}$, and $\rho(\lambda) = 0$ for $|\lambda| > 2\sqrt{d-1}$.    (6)
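
As a sanity check on the Kesten-McKay density, one can verify numerically that it integrates to one and that its second moment equals $d$ (the number of closed walks of length 2 from a vertex). The sketch below is an illustration only; it assumes the standard form $\rho(\lambda) = d\sqrt{4(d-1) - \lambda^2} / (2\pi(d^2 - \lambda^2))$ of the density and uses a simple midpoint rule, and all names are chosen here.

```python
import math


def kesten_mckay_density(lam, d):
    """Kesten-McKay density for d-regular graphs (zero outside the bulk)."""
    edge = 2.0 * math.sqrt(d - 1)
    if abs(lam) >= edge:
        return 0.0
    return d * math.sqrt(4.0 * (d - 1) - lam * lam) / (2.0 * math.pi * (d * d - lam * lam))


def integrate(func, a, b, steps=100000):
    """Midpoint rule on [a, b]; adequate for the square-root behaviour at the edges."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


d = 3
edge = 2.0 * math.sqrt(d - 1)
total_mass = integrate(lambda x: kesten_mckay_density(x, d), -edge, edge)
second_moment = integrate(lambda x: x * x * kesten_mckay_density(x, d), -edge, edge)
```

Both quantities agree with the expected values (1 and $d$) up to the discretization error of the quadrature.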

We will use the following notation throughout this paper: eigenvalues of the adjacency
matrix are denoted by $\lambda$ and those of the Laplacian by $\mu$; superscript indices denote
eigenvectors: $A f^{(i)} = \lambda_i f^{(i)}$; subscript indices denote vertices: $f^{(i)}_j = f^{(i)}(v_j)$. We
choose the normalization $\langle f, f \rangle = n$, so that $E(f_i^2) = 1$, irrespective of $n$. We enumerate
the eigenvalues in the customary order: $d = \lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$, or equivalently
$0 = \mu_1 \le \mu_2 \le \ldots \le \mu_n$. The first eigenvector (or the ground state of $L$) is the constant
vector $f^{(1)} = (1, 1, \ldots, 1)^T$. As the eigenvectors are orthogonal, we get for $i > 1$ that
$\sum_j f^{(i)}_j = \langle f^{(1)}, f^{(i)} \rangle = 0$. We would like to emphasize again that for a regular graph,
the eigenvectors of the adjacency matrix and the Laplacian are identical. Therefore, all
the results that will be derived in the following are applicable (up to the substitution
$\mu = d - \lambda$ of the eigenvalue) to both operators.

For a graph $G = (V, E)$ and a function $f$ on $V$, a positive (negative) nodal domain of $f$
is a maximal connected subgraph of $G$ such that $f(v) > 0$ ($f(v) < 0$) for all of the
vertices in the subgraph §. The nodal count of $f$, which will be denoted by $\nu$, is the
number of nodal domains of $f$.
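
The nodal count can be computed by a breadth-first search which only crosses edges whose endpoints carry the same sign. The following Python sketch illustrates the definition; it assumes a neighbour-list representation and a function that does not vanish on any vertex, and the names are illustrative rather than taken from the original work.

```python
from collections import deque


def count_nodal_domains(neighbors, f):
    """Count maximal connected sets of vertices on which f has a constant sign.

    Assumes f is non-zero on every vertex; an edge (v, w) is kept only if
    f[v] * f[w] > 0.
    """
    n = len(f)
    seen = [False] * n
    count = 0
    for start in range(n):
        if seen[start]:
            continue
        count += 1                      # a new nodal domain is discovered
        seen[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if not seen[w] and f[v] * f[w] > 0:
                    seen[w] = True
                    queue.append(w)
    return count


# A cycle on six vertices with sign pattern + + - - + -.
cycle = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
signs = [1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
nodal_count = count_nodal_domains(cycle, signs)
```

In this example the domains are {0, 1}, {2, 3}, {4} and {5}, so the nodal count is 4.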

In [23], Courant's theorem is generalized to connected discrete graphs, showing that the
$j$th eigenvector of the Laplacian contains no more than $j$ nodal domains. A constraint
on the allowed shapes of domains was derived in [24]: since an adjacency eigenfunction
satisfies $\lambda f_i = \sum_{j \sim i} f_j$, if $\lambda > k$ (for $k \in \mathbb{N}$), then for every positive (negative) nodal
domain, the maximum (minimum) of the domain must have at least $k + 1$ adjacent
vertices of the same sign, therefore the minimal size of a domain is $k + 2$. Similarly, if
$\lambda < 0$, every vertex has at least one adjacent vertex with an opposite sign, therefore for
a negative eigenvalue, nodal domains cannot have inner vertices. In addition, by adding
assumptions on the structure of the graph (for example, by considering trees only), it
is possible to bound the minimal size of a domain for a given eigenvalue [25]. We refer
the reader to [26] for a review of the nodal pattern of general graphs.

§ In the following we will ignore the possibility that $f(v)$ vanishes for some vertex, as this event is of
measure zero for Laplacian eigenvectors of $(n, d)$ graphs.



3. The covariance of an eigenvector 

In this section we would like to estimate the correlation between two distinct components
of an adjacency eigenvector in an $(n, d)$ graph. The distance in $G$ between two vertices
$v_i, v_j \in V$ is the length of the shortest walk in $G$ from $v_i$ to $v_j$; we denote the distance
by $|i - j|$. Setting the $k$-adjacency operator to be

$(\tilde{A}_k)_{ij} = 1$ for $|i - j| = k$, and $0$ otherwise    (7)

we evaluate the correlations between two components of an eigenvector $f$ at distance $k$
by computing the empirical $k$-covariance of $f$ and $G$, defined as

$\mathrm{Cov}_k^{emp}(f, G) = \frac{1}{M_k} \sum_{|i-j|=k} f_i f_j = \frac{1}{M_k} \langle f, \tilde{A}_k f \rangle$    (8)

where $M_k = \sum_{ij} (\tilde{A}_k)_{ij}$ is the number of (directed) $k$-neighbors in $G$.
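
The empirical covariance can be computed directly from its definition by finding, for each vertex, all vertices at graph distance $k$. The sketch below is a simple (quadratic-time) illustration in Python and not the implementation used for the simulations reported here; the function names are assumptions. It is tested on a cycle of six vertices, for which the $k = 1$ covariance of an eigenvector with eigenvalue $\lambda$ is exactly $\lambda/d$.

```python
import math
from collections import deque


def distances_from(neighbors, source):
    """Breadth-first search distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def empirical_covariance(neighbors, f, k):
    """Average of f_i * f_j over all ordered pairs (i, j) at distance exactly k."""
    total, pairs = 0.0, 0
    for i in range(len(f)):
        for j, dij in distances_from(neighbors, i).items():
            if dij == k:
                total += f[i] * f[j]
                pairs += 1
    if pairs == 0:
        raise ValueError("no pair of vertices at distance k")
    return total / pairs


# Cycle on 6 vertices (d = 2) and the eigenvector cos(2*pi*i/6), eigenvalue 1.
n = 6
cycle = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
raw = [math.cos(2 * math.pi * i / n) for i in range(n)]
scale = math.sqrt(n / sum(x * x for x in raw))   # normalization <f, f> = n
f = [scale * x for x in raw]
cov1 = empirical_covariance(cycle, f, 1)         # equals lambda / d = 1 / 2
```

For large graphs one would run a single breadth-first search per vertex and accumulate all values of $k$ at once, rather than calling the function separately for every $k$.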
For $k < \log n / 2$, we can take advantage of the local tree property in order to find an
explicit limiting expression for (8). Under the tree approximation, $M_k = n d (d-1)^{k-1}$.
Moreover, for a tree, $(\tilde{A}_k)_{ij} = 1$ if and only if there is a (unique) walk of length $k$ from $v_i$
to $v_j$ which does not retrace itself (does not backscatter) at any step. Therefore, for a tree
the operator $\tilde{A}_k$ is identical to the 'non-retracing operator', introduced and calculated
in [27]. Clearly, $\tilde{A}_0 = I$ and $\tilde{A}_1 = A$, where $I$ is the identity matrix. $\tilde{A}_2 = A^2 - dI$, as one
has to eliminate from $A^2$ (which corresponds to all possible walks of length 2 in $G$) the
walks which return to their origin at the second step. In a similar manner, one
gets for $k > 2$ that

$\tilde{A}_k = A \tilde{A}_{k-1} - (d-1) \tilde{A}_{k-2}$ .    (9)

The first term is due to all paths of length $k$ which do not retrace in the first $k - 1$ steps,
while the second term eliminates paths which have not retraced in the first $k - 1$ steps,
but do retrace in the $k$th step. Since $A^k f = \lambda^k f$, we get by substituting (9) in (8) that
in the limit the empirical covariance converges to

$\mathrm{Cov}_k^{tree}(\lambda) = \frac{1}{d (d-1)^{k-1}} P_k(\lambda)$ ,    (10)

where $P_k(\lambda)$ is given by the recursion relation:

$P_1(\lambda) = \lambda$
$P_2(\lambda) = \lambda^2 - d$    (11)
$P_{k+2}(\lambda) = \lambda P_{k+1}(\lambda) - (d-1) P_k(\lambda)$ .
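
The recursion (11) is straightforward to evaluate numerically. The following Python sketch iterates the recursion and compares it with one closed form in terms of Chebyshev polynomials of the second kind, $P_k(\lambda) = (d-1)^{k/2} [U_k(x) - U_{k-2}(x)/(d-1)]$ with $x = \lambda / (2\sqrt{d-1})$. This closed form is stated here as an assumption that is checked numerically against the recursion, and all function names are illustrative.

```python
import math


def p_recursive(k, lam, d):
    """P_k(lambda) for k >= 1, from P_1 = lam, P_2 = lam^2 - d and the recursion."""
    if k < 1:
        raise ValueError("k must be at least 1")
    p_prev, p_curr = lam, lam * lam - d
    if k == 1:
        return p_prev
    for _ in range(k - 2):
        p_prev, p_curr = p_curr, lam * p_curr - (d - 1) * p_prev
    return p_curr


def chebyshev_u(k, x):
    """U_k(x) = sin((k + 1) theta) / sin(theta), x = cos(theta), for |x| < 1 and k >= -1."""
    theta = math.acos(x)
    return math.sin((k + 1) * theta) / math.sin(theta)


def p_closed_form(k, lam, d):
    """Assumed closed form of P_k inside the bulk |lam| < 2 sqrt(d - 1)."""
    q = d - 1
    x = lam / (2.0 * math.sqrt(q))
    return q ** (k / 2.0) * (chebyshev_u(k, x) - chebyshev_u(k - 2, x) / q)


def cov_tree(k, lam, d):
    """Limiting covariance P_k(lambda) / (d (d - 1)^(k - 1))."""
    return p_recursive(k, lam, d) / (d * (d - 1) ** (k - 1))


# Largest disagreement between the two evaluations for d = 3, lambda = 1, k = 1..8.
max_gap = max(abs(p_recursive(k, 1.0, 3) - p_closed_form(k, 1.0, 3)) for k in range(1, 9))
```

For $k = 1$ and $k = 2$ the function `cov_tree` returns $\lambda/d$ and $(\lambda^2 - d)/(d(d-1))$, the covariances of nearest and next-nearest neighbours.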

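Introducing Chebyshev polynomials of the second kind [28]:

$U_k(\cos\theta) = \frac{\sin((k+1)\theta)}{\sin\theta}$ ,    (12)

the solution to this recursion relation, subject to the initial conditions, can be written
as

$\mathrm{Cov}_k^{tree}(\lambda) = \frac{1}{d (d-1)^{k/2}} \left[ (d-1) \, U_k\left( \frac{\lambda}{2\sqrt{d-1}} \right) - U_{k-2}\left( \frac{\lambda}{2\sqrt{d-1}} \right) \right]$ .    (13)

The functions $\mathrm{Cov}_k^{tree}(\lambda)$ are orthogonal polynomials of degree $k$ in $\lambda$ with respect to
the Kesten-McKay measure $\rho(\lambda)$ (6), satisfying

$\int \mathrm{Cov}_k^{tree}(\lambda) \, \mathrm{Cov}_{k'}^{tree}(\lambda) \, \rho(\lambda) \, d\lambda = \frac{\delta_{k,k'}}{d (d-1)^{k-1}}$ .    (14)

This result has a simple combinatorial interpretation. Following (8), the left hand side
of (14) is nothing but

$\lim_{n \to \infty} \frac{n}{M_k M_{k'}} \mathrm{Tr}(\tilde{A}_k \tilde{A}_{k'}) = \frac{1}{d (d-1)^{k-1}} \frac{1}{M_{k'}} \mathrm{Tr}(\tilde{A}_k \tilde{A}_{k'})$    (15)

Note that $\mathrm{Tr}(\tilde{A}_k \tilde{A}_{k'})$ is the number of closed walks in $G$ which are composed of a
non-retracing walk of length $k$, followed by a non-retracing walk of length $k'$. The only way
to perform such a walk on a tree is by going back and forth, therefore $\mathrm{Tr}(\tilde{A}_k \tilde{A}_{k'}) = 0$
if $k \ne k'$, or $M_k$ (the number of non-backscattering walks of length $k$) for $k = k'$. A
substitution yields the identity (14).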
As $|U_k(x)| \le k + 1$ for $|x| \le 1$, the limiting expression for the covariance (13) is an oscillatory
function which decays as $(d-1)^{-k/2}$ (up to a factor which is at most linear in $k$). This behavior
is analogous to the expected rate of decay for continuous chaotic manifolds (4). The surface of
a ball of radius $r$ in $\mathbb{R}^n$ grows as $S(r) \sim r^{n-1}$, while for regular trees, the surface of the
ball grows as $S(r) \sim (d-1)^r$. Therefore, in both cases, the rate of decay of the covariance is
inversely proportional to the square root of the area of the sphere.

While for short distances the empirical covariance converges to $\mathrm{Cov}_k^{tree}(\lambda)$, the
validity of this approximation is expected to deteriorate as $k$ exceeds $\log n / 2$. As the
computation presented above does not provide an error estimate, we have turned to
numerical simulations. We calculated the empirical covariance (8) numerically for
several realizations of regular graphs, and compared it to the limiting expression
(13). The graphs were generated following [29], using MATLAB. As expected,
for short distances the agreement is very good, while for $k \sim \log n$ the deviations are
evident. Figure 1 demonstrates this behavior for a realization of a (4000, 3) graph, where
$\log_2(4000) = 11.97$.
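
For readers who wish to reproduce such experiments, a random regular graph can be sampled in a few lines. The sketch below uses the pairing (configuration) model with rejection, which yields a uniformly distributed simple $d$-regular graph; it is written in Python as an illustration and is not the generator of [29] nor the MATLAB code used here. It is practical only for small $d$, since the acceptance probability decreases quickly as $d$ grows. The function name and parameters are assumptions.

```python
import random


def random_regular_graph(n, d, rng, max_tries=10000):
    """Sample a simple d-regular graph on n vertices as sorted neighbour lists.

    Each vertex gets d 'stubs'; the stubs are paired uniformly at random and the
    pairing is rejected if it creates a loop or a multiple edge.
    """
    if (n * d) % 2 != 0 or d >= n:
        raise ValueError("need n * d even and d < n")
    for _ in range(max_tries):
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        neighbors = [set() for _ in range(n)]
        simple = True
        for a, b in zip(stubs[0::2], stubs[1::2]):
            if a == b or b in neighbors[a]:
                simple = False          # loop or multiple edge: reject this pairing
                break
            neighbors[a].add(b)
            neighbors[b].add(a)
        if simple:
            return [sorted(s) for s in neighbors]
    raise RuntimeError("no simple pairing found within max_tries")


graph = random_regular_graph(20, 3, random.Random(0))
```

The resulting neighbour lists can be converted to an adjacency matrix and diagonalized with any numerical linear algebra package.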

Although for large distances the empirical covariance deviates from $\mathrm{Cov}_k^{tree}(\lambda)$, it
seems that the expected rate of decay is reproduced quite well, so that asymptotically
$|\mathrm{Cov}_k^{emp}(f, G)| \sim (d-1)^{-k/2}$. In order to test this assumption, we introduce for a given
$G(n, d)$ and $k$ the two norms

$N_k^{emp}(G) = \sqrt{\frac{1}{n} \sum_j \left( \mathrm{Cov}_k^{emp}(f^{(j)}, G) \right)^2}$ ,    $N_k^{tree}(G) = \sqrt{\frac{1}{n} \sum_j \left( \mathrm{Cov}_k^{tree}(\lambda_j) \right)^2}$    (16)

and define the (scaled) norm deviation as:

$\Delta N(G, k) = \frac{N_k^{emp} - N_k^{tree}}{N_k^{emp} + N_k^{tree}}$    (17)

If the asymptotic rates of decay of $\mathrm{Cov}_k^{tree}$ and $\mathrm{Cov}_k^{emp}$ are similar ‖, then $|\Delta N|$ will be
bounded away from 1 for all $k$ and $n$.

‖ meaning that for some positive $c_1(d)$, $c_2(d)$ and for all $n$ and $k$, we get with high probability that
$c_1 N_k^{tree}(G) < N_k^{emp}(G) < c_2 N_k^{tree}(G)$.

The (average) norm deviation of an eigenvector is a function of the three parameters



Eigenvectors of the discrete Laplacian on regular graphs - a statistical approach 



Figure 1. A comparison between $\mathrm{Cov}_k^{tree}(\lambda)$ (marked by lines) and $\mathrm{Cov}_k^{emp}(f, G)$ for
a single realization of $G(4000, 3)$ (denoted by different markers), where $\log_{d-1}(n) = 11.97$.
The horizontal axis is the eigenvalue $\lambda$, ranging from $-2.5$ to $2.5$.
Upper figure: a comparison for $k = 4, 5, 6$. Lower figure: $k = 11, 12$.



$n$, $d$ and $k$. However, a comparison of the norm deviation for several realizations of $(n, d)$
graphs with various values of $n$ and $d$ suggests that it might be well approximated by a
function of two parameters only: $d$ and the scaled parameter $k / \log_{d-1}(n)$ (see figure 2,
left). For a fixed value of $\log_{d-1}(n)$, the deviation decreases as $d$ increases (figure 2,
right). In addition, for $k / \log n < 1$, the norm deviation is bounded away from one
for any $d$. Since the diameter of an $(n, d)$ graph is very close to $\log(n \log n)$, we get that
$\max(k) / \log n$ approaches one as $n$ approaches infinity, therefore we expect $|\Delta N|$ to be
bounded away from one for all $k$.

Figure 2. The scaled norm deviation (17) for several realizations of regular graphs.
Left: $\Delta N$ as a function of $k / \log_3(n)$, for 3 realizations of 4-regular graphs consisting
of 2000, 3000 and 4000 vertices.
Right: $\Delta N$ as a function of $k / \log_{d-1}(n)$, for (4000, 3), (4000, 4), (4000, 5) and (4000, 6)
graphs.



4. The limiting distribution of eigenvectors 

In the last section we have shown some of the similarities between the limiting
expressions for the covariance on regular graphs and the autocorrelation in chaotic
billiards (4). In this section and the next, we present extensive numerical evidence which
suggests that the distribution of eigenvectors of $(n, d)$ graphs follows a Gaussian measure,
resembling the conjectured distribution [10] for eigenvectors of quantum billiards (see
section 2.1).

In order to examine the limiting distribution of the eigenvector components of an $(n, d)$
graph, we first have to define the ensemble we are interested in. As an example,
we can fix a graph $G = (V, E)$ and a vertex $v_i \in V$, and ask for the distribution of $f_i$, the
$i$th component of a randomly chosen eigenvector. A second option is to fix an eigenvalue
$\lambda_0 \in [-2\sqrt{d-1}, 2\sqrt{d-1}]$, and ask for the limiting distribution at an arbitrary vertex,
where we choose a graph at random and look at the eigenvector whose eigenvalue is the
closest to $\lambda_0$. In the same manner, it is possible to fix an $(n, d)$ graph and an eigenvector
$f^{(j)}$, and ask for the limiting distribution of $f^{(j)}_i$ for a randomly chosen vertex $v_i \in V$.
In the following, we suggest that as $n \to \infty$, the distribution of the eigenvector
components with respect to the first two ensembles converges to a Gaussian. As
for the third ensemble, numerical simulations (e.g. [3]), together with the results of this
work, imply that a limiting distribution for that ensemble may not exist.
For the sake of clarity, we begin by considering the limiting distribution at a single
vertex, which will be followed by the study of a multivariate version. We will denote
by $p(x)$ and $\Phi(x)$ the density and the cumulative distribution function (cdf) of the
standard normal variable:

$p(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ ,    $\Phi(x) = \int_{-\infty}^{x} p(y) \, dy$    (18)



4.1. The limiting univariate distribution

Based on numerical simulations we suggest the following limits:

• Hypothesis I (univariate): For asymptotically almost any $G(n, d) = (V, E)$ and
$v_i \in V$, the probability that $f^{(j)}(v_i) < x$, where $f^{(j)}$ is an adjacency eigenvector
chosen uniformly from $\{f^{(1)}, f^{(2)}, \ldots, f^{(n)}\}$, is bounded by

$|P(f^{(j)}(v_i) < x) - \Phi(x)| < \Delta(n)$    (19)

where $\Delta(n) \to 0$ as $n \to \infty$.

• Hypothesis II (univariate): For a given $\lambda_0 \in [-2\sqrt{d-1}, 2\sqrt{d-1}]$ and $1 \le i \le n$,
the distribution of $f^{(\lambda_0)}(v_i)$ converges to the normal distribution (18), where
$f^{(\lambda_0)}$ is an adjacency eigenvector with the closest eigenvalue to $\lambda_0$ of a uniformly
randomly chosen $(n, d)$ graph.

These assumptions may be examined by comparing the appropriate empirical cumulative
distribution functions to $\Phi(x)$. A plausible measure of the distance between two
cumulative distributions $\mathcal{P}_1(x), \mathcal{P}_2(x)$ is the Kolmogorov-Smirnov (KS) distance:

$\mathcal{D}^{KS} = \sup_x |\mathcal{P}_1(x) - \mathcal{P}_2(x)|$    (20)
According to Kolmogorov's theorem, if $\mathcal{P}_2(x)$ is the empirical cdf of $n$ iid variables
generated with respect to the cdf $\mathcal{P}_1(x)$, then the limiting distribution of $\mathcal{D}^{KS}$ is given by

$\lim_{n \to \infty} P\left( \sqrt{n} \, \mathcal{D}^{KS} \le x \right) = 1 - 2 \sum_{i=1}^{\infty} (-1)^{i-1} e^{-2 i^2 x^2}$    (21)
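
The KS distance between a sample and the standard normal cdf, and the limiting Kolmogorov distribution in its standard series form $1 - 2\sum_{i \ge 1} (-1)^{i-1} e^{-2i^2x^2}$, can be evaluated as follows. This Python sketch is an illustration under these assumptions; the function names are chosen here and are not part of the original work.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ks_distance_to_normal(sample):
    """Supremum distance between the empirical cdf of the sample and normal_cdf."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        c = normal_cdf(x)
        # The empirical cdf jumps from i / n to (i + 1) / n at x.
        worst = max(worst, abs(c - i / n), abs((i + 1) / n - c))
    return worst


def kolmogorov_cdf(x):
    """Limiting cdf of sqrt(n) * D_KS, evaluated by its alternating series."""
    if x <= 0.0:
        return 0.0
    terms = int(5.0 / x) + 10   # enough terms for the exponentials to be negligible
    series = sum((-1) ** (i - 1) * math.exp(-2.0 * i * i * x * x) for i in range(1, terms + 1))
    return 1.0 - 2.0 * series
```

As a usage example, `kolmogorov_cdf(1.36)` is close to 0.95, the familiar 5% critical value of the KS test; the empirical cdf of the values $\sqrt{n}\,\mathcal{D}^{KS}$ measured at different vertices can be compared with this function.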

Figure 3. Upper left figure: a comparison between $\mathcal{F}_i(x)$ for two vertices of a
realization of a (4000, 3) graph, the empirical cdf of 4000 iid normal variables and $\Phi(x)$.
Upper right figure: a comparison between (21), the empirical cdf of $\mathcal{D}^{KS}$ for 4000
vertices of a (4000, 3) graph and the empirical cdf of $\mathcal{D}^{KS}$ for 4000 independent vectors
of 4000 iid normal variables.
Lower figure: deviations of the empirical cdf of $\mathcal{D}^{KS}$ from (21), for a single realization
of $G(4000, 3)$, $G(2000, 3)$, $G(4000, 6)$ and the iid normal variables. A positive deviation
corresponds to a faster convergence than predicted by (21).



In order to test the first hypothesis we have generated realizations of $(n, d)$ graphs for
several values of $n$ and $d$. For a given graph and a vertex $v_i \in V$, the empirical cdf is
given by

$\mathcal{F}_i(x) = \frac{1}{n} \#\{ j \mid f^{(j)}_i < x \}$    (22)

A numerical comparison between $\mathcal{F}_i(x)$ and $\Phi(x)$ shows persuasively that the differences
between the two distributions are of order $1/\sqrt{n}$, as is demonstrated in the upper left
plot in figure 3.

Since the components $f^{(j)}_i$ are not independent, the KS distance between $\mathcal{F}_i(x)$ and
$\Phi(x)$ is not expected a priori to follow (21). However, the measured KS distances for
different vertices of the same graph were found to be very close to (21), as can be seen
in figure 3 (upper right figure). In fact, the observed convergence of $\mathcal{D}^{KS}$ seems to be
slightly faster than predicted by (21), irrespective of $n$ and $d$ (figure 3, lower figure).
In order to examine hypothesis II, we have generated 10000 realizations of (6000, 3)
graphs, for which the bulk of the spectrum is supported on $[-2\sqrt{2}, 2\sqrt{2}]$, with
$2\sqrt{2} \approx 2.828$. We have compared $\mathcal{F}_\lambda(x)$, the empirical cdf of the appropriate
components, to $\Phi(x)$ for various values of $\lambda$, varying from $-2.82$ up to $2.82$. In this case as
well, the distance between $\mathcal{F}_\lambda(x)$ and $\Phi(x)$ tends to zero as $1/\sqrt{n}$. In addition, no
substantial differences in the limiting distribution, nor in the rate of convergence, were
observed for different values of $\lambda$, as we demonstrate in figure 4.

Figure 4. The deviation of the empirical cdf, calculated according to the requirements
of hypothesis II, from $\Phi(x)$. The different curves correspond to $\lambda_1 = 2.82$, $\lambda_2 = 0$,
$\lambda_3 = -2.82$ and the empirical cdf of 10000 iid normal variables.

4.2. The multivariate limiting distribution

In order to formulate a multivariate version of the suggested distributions, we introduce
the following notation:
For a graph $G = (V, E)$ and $U = \{u_1, \ldots, u_m\} \subset V$, the distance matrix $D(U, G)$ is
defined as $(D(U, G))_{ij} = |i - j|$. $\mathrm{diam}(U) = \max(D(U, G))$ is the diameter of $U$ in
$G$. For a given $d$, a distance matrix is 'good' if it can be embedded in a $d$-regular
tree. A subset $U \subset V$ is good if its distance matrix is good. We should note that
if $G$ is an $(n, d)$ graph, then (due to the local tree property) almost any $U \subset V$ with
$\mathrm{diam}(U) < \log n / 2$ is good.
For $U \subset V$ and $\lambda \in [-2\sqrt{d-1}, 2\sqrt{d-1}]$, we define the limiting covariance matrix
$C_\lambda(U)$ by $(C_\lambda)_{ij} = \mathrm{Cov}_{|i-j|}^{tree}(\lambda)$, where $\mathrm{Cov}_k^{tree}(\lambda)$ is given by (13).
We will denote by $p(x, \Sigma)$ and $\Phi(x, \Sigma)$ the density and the cdf of the multivariate normal
variable $x = (x_1, \ldots, x_m)^T$ with mean zero and covariance matrix $\Sigma$:

$p(x, \Sigma) = \frac{1}{\sqrt{(2\pi)^m |\Sigma|}} e^{-(x^T \Sigma^{-1} x)/2}$ ,    $\Phi(x, \Sigma) = \int_{-\infty}^{x} p(y, \Sigma) \, dy$    (23)

Equipped with this notation, we suggest the following for $\lambda_0 \in [-2\sqrt{d-1}, 2\sqrt{d-1}]$
and a fixed $m$:

• Hypothesis I (multivariate): For almost any $G(n, d) = (V, E)$ and a good $U \subset V$
of small diameter in $G$, the probability that $f(U) = (f(u_1), \ldots, f(u_m))^T < x$, where $f$
is an adjacency eigenvector chosen uniformly from the spectral window $[\lambda_0, \lambda_0 + \epsilon]$,
is bounded by

$|P(f(U) < x) - \Phi(x, C_{\lambda_0})| < \Delta(n, \epsilon)$    (24)

where $\Delta(n, \epsilon) \to 0$ as $\epsilon \to 0$ but $\epsilon n \to \infty$.

• Hypothesis II (multivariate): For a good distance matrix $D$, the distribution of
$f^{(\lambda_0)}(U)$ converges as $n \to \infty$ to the multivariate normal distribution (23) with covariance
matrix $C_{\lambda_0}$, where $f^{(\lambda_0)}$ is an adjacency eigenvector with the closest eigenvalue to
$\lambda_0$ of a uniformly chosen $(n, d)$ graph and $U$ satisfies $D(U, G) = D$.

Two remarks are in order. First, in hypothesis I we avoid the question of how small
$\mathrm{diam}(U)$ should be, as we base the hypothesis mainly on numerical simulations, which
are applicable to small-diameter distance matrices only (see section 5.2). Second, we
should note that in equation (23) we assume the existence of the inverse covariance
matrix. The existence of an inverse for $C_{\lambda_0}$ will be discussed in the next section, where
we show that for distance matrices which contain a vertex and all of its neighbors,
the covariance is singular. We also demonstrate how this computational obstacle can be
removed by a simple coordinate transformation.

Unlike the univariate conjectures, a comprehensive numerical examination of the
multivariate versions is a hard task. As a first step, we restricted ourselves to
the comparison of the empirical cdf of two adjacent vertices to (24), where for this
configuration $C_\lambda$ is given by $(C_\lambda)_{11} = (C_\lambda)_{22} = 1$, $(C_\lambda)_{12} = (C_\lambda)_{21} = \lambda/d$.
For $k$ iid bivariate normal variables with mean zero and covariance $C_\lambda$, the empirical cdf
is expected to converge to $\Phi(x, C_\lambda)$ as $1/\sqrt{k}$. In order to check the second hypothesis
we have measured the values at two adjacent vertices $(f_1, f_2)$ over 20000 realizations
of (6000, 4) graphs, taken from the eigenvector with the closest eigenvalue to some $\lambda$. In order
to evaluate the rate of convergence as a function of the number of realizations, we have
calculated, for various values of $k$, the empirical cdf for the first $k$ measurements:

$\mathcal{F}_{\lambda,k}(x) = \frac{1}{k} \#\{ i \mid i \le k, \; f_1^{(i)} < x_1, \; f_2^{(i)} < x_2 \}$    (25)

Finally, for every $\lambda$ and $k$, we have calculated $\max_x |\Phi(x, C_\lambda) - \mathcal{F}_{\lambda,k}(x)|$. As
demonstrated in figure 5, the deviation does decrease as $k$ increases; however, the
convergence is slower than the one measured for iid bivariate normal variables. In
addition, for larger eigenvalues (smaller Laplacian eigenvalues), we have observed a
faster convergence.
The first hypothesis is harder to test directly, as it requires generating (and, more
problematically, diagonalizing) a very large graph, in order to have a narrow spectral
window which contains many eigenvectors. By using MATLAB's function eigs.m, we
have explored relatively narrow spectral windows ($\lambda_{max} - \lambda_{min} \approx 0.1$) of (13000, 3)
graphs, which contain between 200 and 400 eigenvectors (the exact number depends on
the spectral density at $\lambda$). The KS distance between the empirical cdf for those windows
and $\Phi(x, C_\lambda)$ was consistent with the measured deviation in the previous experiment
for the same number of samples.

Figure 5. The maximal deviation between the cumulative distributions $\mathcal{F}_{\lambda,k}(x)$ and
$\Phi(x, C_\lambda)$ as a function of $k$, the number of generated graphs (logarithmic scale). The
different curves correspond to $\lambda_1 = 3.44$, $\lambda_2 = 0$ and $\lambda_3 = -3.44$. The maximal
deviation is compared to that of $k$ iid vectors, generated with respect to the measure
$\Phi(x, C_{-3.44})$.

Additional support for the Gaussian approximation will be introduced in the next
section, where we reconstruct the structure of the nodal pattern of an eigenvector,
assuming the suggested normal distribution.



5. The nodal structure of eigenvectors 

For a graph $G = (V, E)$ and a function $f$ on $V$, we define the induced nodal graph
$G_f = (V, E_f)$ by the deletion of edges which connect vertices of opposite signs in
$f$: $E_f = \{(v_i, v_j) \in E \mid f_i f_j > 0\}$. In this section we analyze the nodal pattern of the
eigenfunctions, assuming the multivariate normal distribution, as stated in hypothesis II.
We will demonstrate that this assumption allows us not only to evaluate the expectation
of the nodal count, but also to estimate the distribution of the size and shape of
domains. In particular we will demonstrate that the nodal structure cannot be imitated
by percolation-like models.



5.1. Distribution of valency 

We begin by calculating $P_e(\lambda)$, the probability that an edge $e \in E$ of a random graph
belongs to $E_f$ for an eigenvector $f$ with eigenvalue $\lambda$. This is twice the probability of
two adjacent vertices to be positive, which according to hypothesis II equals

$P_e(\lambda) = 2 \int_0^\infty \int_0^\infty \frac{1}{2\pi \sqrt{|C_\lambda|}} \exp\left( -\frac{1}{2} \langle f, C_\lambda^{-1} f \rangle \right) df_1 \, df_2$    (26)

where $(C_\lambda)_{11} = (C_\lambda)_{22} = 1$ and $(C_\lambda)_{12} = (C_\lambda)_{21} = \lambda/d$. Integrating, we get that:

$P_e(\lambda) = \frac{1}{2} + \frac{1}{\pi} \arcsin\left( \frac{\lambda}{d} \right)$    (27)

Figure 6. A comparison between the Gaussian prediction for $P_e(\lambda)$ (lines) and the
empirical result (markers) for a single random realization of 3, 6 and 15 regular graphs
on 4000 vertices.

$P_e(\lambda) - 1/2$ is antisymmetric with respect to $\lambda = 0$, i.e. $P_e(-\lambda) = 1 - P_e(\lambda)$. In addition,
since $|\lambda| < 2\sqrt{d-1}$, for small values of $d$, $P_e(\lambda)$ varies considerably along the spectrum
(thus, for $d = 3$, $P_e(\lambda)$ can take values in the interval [0.1, 0.9]), while for large $d$ the
changes are moderate (for $d = 100$, for example, it is constrained to [0.44, 0.56]). As
demonstrated in figure 6, this result describes with high accuracy the observed probability,
for various values of $d$.
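
The quoted ranges of $P_e(\lambda)$ are easy to reproduce. The following Python sketch assumes that (27) is the standard quadrant-probability expression $P_e(\lambda) = 1/2 + \arcsin(\lambda/d)/\pi$ for a bivariate normal pair with correlation $\lambda/d$; the function names are illustrative.

```python
import math


def edge_probability(lam, d):
    """Probability that two adjacent components have the same sign (Gaussian model)."""
    return 0.5 + math.asin(lam / d) / math.pi


def edge_probability_range(d):
    """Values of the edge probability at the two edges of the bulk spectrum."""
    edge = 2.0 * math.sqrt(d - 1)
    return edge_probability(-edge, d), edge_probability(edge, d)


low3, high3 = edge_probability_range(3)        # roughly 0.11 and 0.89
low100, high100 = edge_probability_range(100)  # roughly 0.44 and 0.56
```

The two ranges agree, after rounding, with the intervals quoted above for $d = 3$ and $d = 100$.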
The Gaussian model also predicts a distribution for the valency of vertices in $G_f$.
In order to evaluate $P_j(\lambda)$, the probability of a vertex $v_0 \in V$ to be of valency $j$ in
$G_f$, one should consider the joint distribution of $f_0$ and its $d$ neighbors $\{f_1, \ldots, f_d\}$.
The appropriate covariance entries are given by $(C_\lambda)_{ii} = 1$, $(C_\lambda)_{0j} = (C_\lambda)_{j0} = \lambda/d$ and
$(C_\lambda)_{jk} = (\lambda^2 - d)/(d(d-1))$, for $0 \le i \le d$, $1 \le j, k \le d$ and $j \ne k$.

This matrix is singular, as $(\lambda, -1, -1, \ldots, -1)^T$ is an eigenvector of $C_\lambda$ with zero eigenvalue.
This singularity is due to the constraint $\lambda f_0 = \sum_j f_j$, which is kept by the Gaussian
model. In order to avoid the singularity we may integrate with respect to the new
Gaussian variables $\tilde{f} = (\lambda f_0 - \sum_j f_j, f_1, \ldots, f_d)^T$. Introducing the (invertible) matrix $\tilde{C}_\lambda$,
obtained from $C_\lambda$ by changing the off-diagonal terms in the zeroth row and column to
zero, $P_j(\lambda)$ can be written as:



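    p_j(\lambda) = [integral not recoverable from the source; the surviving fragments show a normalization involving (2\pi) and the determinant of the covariance matrix, and products over i = 1, ..., j and i = j+1, ..., d]    (28)
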

where the prefactor is due to the sign symmetry and the different alternatives to 
choose j out of d adjacent vertices. An immediate result of this expression is the 
symmetry p_j(λ) = p_{d−j}(−λ). This integral cannot be calculated explicitly; however, it 
can be evaluated numerically, e.g. by the method of [30]. As in the study of p_e(λ), the Gaussian 
prediction is very close to the observed results. As an example, in figure 7 we compare 
the Gaussian prediction for p_j(λ) and d = 6, evaluated by the function qscmvnv.m [31], 
to the measured result for a single realization of a (4000, 6) graph. 

Figure 7. A comparison between the Gaussian prediction of p_j(λ) and the empirical 
result for a single realization of G(4000, 6). 
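
The valency distribution can also be estimated by direct sampling rather than by numerical integration. The sketch below is not the method of [30] or the routine of [31]; it is a plain Monte Carlo estimate under the Gaussian model, assuming λ ≠ 0, unit variances, neighbor-neighbor covariance c = (λ² − d)/(d(d − 1)) and the constraint λ f_0 = Σ_j f_j. The function name and the sampling construction are illustrative choices.

```python
import math
import random


def sample_valency_distribution(lam, d, samples=20000, seed=0):
    """Monte Carlo estimate of p_j(lam), j = 0..d, under the Gaussian model.

    The neighbors are built as f_j = a*g_j + b*sum(g) from independent standard
    normals g, which gives unit variances and pairwise covariance c. The central
    value f_0 is then fixed by the constraint lam * f_0 = sum_j f_j.
    """
    if lam == 0:
        raise ValueError("this sketch assumes lam != 0")
    rng = random.Random(seed)
    c = (lam ** 2 - d) / (d * (d - 1))
    a = math.sqrt(1.0 - c)
    b = (abs(lam) / math.sqrt(d) - a) / d   # root of d*b^2 + 2*a*b = c
    counts = [0] * (d + 1)
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        s = sum(g)
        neighbors = [a * x + b * s for x in g]
        f0 = sum(neighbors) / lam
        # Valency in G_f: number of neighbors sharing the sign of f_0.
        valency = sum(1 for x in neighbors if x * f0 > 0)
        counts[valency] += 1
    return [k / samples for k in counts]


dist = sample_valency_distribution(2.0, 6)
print([round(p, 3) for p in dist])
```

Because flipping the sign of λ only flips the sign of f_0 in this construction, running it with the same seed for λ and −λ reproduces the symmetry p_j(λ) = p_{d−j}(−λ) sample by sample. The mean valency divided by d gives an independent estimate of p_e(λ).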

5.2. The nodal count of an eigenfunction 

In [1], the following intriguing properties of the nodal count {ν_j}_{j=1}^{n} for the eigenvectors 
of (n, d) graphs were observed. First, for all j < j_0(d, n), ν_j was found to be exactly 
2, where the fraction j_0/n of eigenvectors with exactly two nodal domains 
increases with d. Second, for small values of d, and for j > j_0, the nodal count 
increases approximately linearly with j. While the known bounds on the nodal count 
(see section 2.2) are far from satisfactory in explaining this behavior, we would 
like to demonstrate in this section how the expected nodal count emerges from 
the Gaussian model. 

Adopting the Gaussian expression (27) for p_e(λ), it is possible to derive a lower bound on 
the expected nodal count of an eigenvector. The number N of connected components of 
a graph G = (V, E) is given by N = |V| − |E| + C, where C is the number of independent 
cycles in G. Since on average the induced nodal graph G_f possesses p_e(λ) · |E| = (nd/2) p_e(λ) 
edges and n vertices, and since C ≥ 0, the expected nodal count is bounded from below (for all of the 
eigenvectors but the first) by 

    E(\nu(n, \lambda)) \geq \max\left\{ 2,\; n\left(1 - \frac{d}{2}\, p_e(\lambda)\right) \right\}    (29)

We should note that this bound is effective only for d ≤ 7, as for larger values 
1 − d p_e(λ)/2 is negative for all |λ| < 2√(d − 1). 
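
The bound is straightforward to evaluate. The following sketch assumes the lower bound max{2, n(1 − d p_e(λ)/2)} with p_e(λ) = 1/2 + arcsin(λ/d)/π; the function names are hypothetical. It also reports for which degrees the second argument of the maximum is positive anywhere in |λ| < 2√(d − 1), by testing the lower spectral edge, where p_e is smallest.

```python
import math


def edge_probability(lam, d):
    """Gaussian prediction for the edge probability, p_e = 1/2 + arcsin(lam/d)/pi."""
    return 0.5 + math.asin(lam / d) / math.pi


def nodal_count_lower_bound(n, d, lam):
    """Lower bound on the expected nodal count: max(2, n * (1 - d * p_e / 2))."""
    return max(2.0, n * (1.0 - 0.5 * d * edge_probability(lam, d)))


def bound_is_effective_somewhere(d):
    """True if 1 - d*p_e/2 > 0 at lam = -2*sqrt(d-1), where p_e is minimal."""
    lam_min = -2.0 * math.sqrt(d - 1)
    return 1.0 - 0.5 * d * edge_probability(lam_min, d) > 0.0


effective_degrees = [d for d in range(3, 12) if bound_is_effective_somewhere(d)]
print("degrees with a nontrivial bound:", effective_degrees)
print("n=4000, d=3, lam=-2.8:", round(nodal_count_lower_bound(4000, 3, -2.8), 1))
```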

For low values of d this crude bound matches the observed nodal count surprisingly 
well, as is demonstrated in figure 8 for a (4000, 3) graph. The good agreement can 
be understood if we consider the critical properties of the nodal pattern. Numerical 
observations [32] suggest that for d ≤ 5, the induced nodal graph G_f exhibits a phase 
transition at some λ_c as n → ∞. In the subcritical phase (λ < λ_c), the size of the 
largest nodal domains is proportional to log n, while in the supercritical phase (λ > λ_c), 
two giant components of order n emerge. 

Figure 8. A comparison between (29) and the observed count for a single realization of 
G(4000, 3). The inset is a magnification of the spectral window near the bound's flexion, 
the only part of the spectrum in which the observed count deviates considerably from 
the bound. 
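
The quantities discussed here, namely the number of nodal domains and the size of the largest one, can be measured for any given vector on a graph. The following is a minimal sketch using breadth-first search over edges whose endpoints share a sign; the function name and the dictionary-based graph representation are illustrative assumptions, and the 8-cycle is only a toy input.

```python
from collections import deque


def nodal_domains(adjacency, f):
    """Sizes of the nodal domains of f, largest first.

    adjacency: dict mapping each vertex to an iterable of its neighbors.
    f: dict mapping each vertex to a nonzero real value.
    A nodal domain is a maximal connected set of vertices on which f
    keeps a constant sign.
    """
    seen = set()
    sizes = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in adjacency[v]:
                # Follow an edge only if both endpoints have the same sign.
                if w not in seen and f[v] * f[w] > 0:
                    seen.add(w)
                    queue.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# Toy example: an 8-cycle with the sign pattern + + - - + + - -
n = 8
cycle = {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}
f = {v: (1.0 if v % 4 < 2 else -1.0) for v in range(n)}
sizes = nodal_domains(cycle, f)
print("nodal count:", len(sizes), "largest domain:", sizes[0])
```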

As the number of connected components of size log n in an (n, d) graph which contain 
log(log n) cycles is almost surely zero, the expected number of independent closed cycles 
in G_f in the subcritical regime must be much smaller than n log(log n)/log n (as there 
cannot be more than n/log n domains comparable in size to log n). As a result, for 
λ < λ_c the deviation between (29) and the expected count is at most of order 1/log(n). 
This result is reflected in figure 9, which demonstrates that for d ≤ 5 (where a subcritical 
phase is observed), the measured count converges to (29) for low enough eigenvalues. 

Figure 9. The deviation of ν/n from the lower bound (29), for 3, 4, 5, 6 and 7 regular 
graphs on 4000 vertices. 

The fact that only two nodal domains are observed for a large number of eigenvectors 
is also consistent with the existence of a supercritical phase. A general property of 
supercritical systems is the scarcity of large but finite clusters: the expected number 
of clusters of size s decays asymptotically as exp(−s(p − p_c)^{1/γ}), for some (model 
dependent) positive γ. Therefore, the supercritical phase consists of a giant component 
and 'dust'. When we consider the nodal pattern of supercritical eigenfunctions (i.e. 
those with λ > λ_c), two special phenomena occur. The first is the appearance of two 
giant components, a positive and a negative domain. The second is the rarity of 
small domains: as was mentioned in section 2.2, the distribution of the eigenvectors 
is constrained, preventing the existence of small domains for large enough values of 
λ. As a result, we expect to find only rarely more than 2 nodal domains for a 
considerable number of the first eigenvectors (which are deep enough in the supercritical 
regime). Moreover, as d increases, the value of λ_c decreases; therefore the expected 
number of such eigenvectors is supposed to increase with d (as is indeed observed). 
As the size distribution of clusters is expected to decay rapidly, we can tighten the 
bound on the expected nodal count considerably by calculating ν_k(n, λ), the expected 
number of domains of size k, for small values of k. It is easy to see that ν_1(n, λ) = n p_0(λ), 
where p_0(λ) is given by (28). For k > 1 the calculation can be carried out in the same 
spirit: the probability for k given vertices to form a nodal domain of size k can be 
evaluated through a (k(d − 1) + 2)-dimensional integral (over the k vertices and their 
k(d − 2) + 2 neighbors), in a similar manner to (28). Finally, ν_k(n, λ) is given (up to 
small corrections) by summing the probabilities over all trees of size k and maximal 
valency d, multiplied by the number of such trees in G. The agreement between the 
Gaussian prediction and the observed distribution of ν_k is demonstrated in figure 10. 

Figure 10. The Gaussian prediction for ν_k/n for d = 3 and 1 ≤ k ≤ 40 (lines), 
compared to the observed count (markers) for a single realization of G(4000, 3). 

As ν(λ) = Σ_k ν_k, the partial sums Σ_k ν_k(n, λ) over small k should converge to E(ν(λ, n)). In figure 11 
we plot the maximal deviation between Σ ν_k (for 1 ≤ k ≤ 4) and the measured count 
of (4000, d) graphs for 3 ≤ d ≤ 10. It can be seen that the convergence is much faster for 
d > 5. This is consistent with the relatively slow decay in the size distribution which is 
expected in the vicinity of the critical point. 



5.3. Eigenvectors and percolation 

For a G(n, d) graph and 0 ≤ p ≤ 1, the induced percolation graph G(n, d, p) is obtained 
by deleting the edges of G independently with probability 1 − p. As was mentioned in 
section 2.1, for two-dimensional billiards, it is believed that the nodal pattern exhibits 
(in the semi-classical limit) a percolation-like behavior [2]. In this section we compare 
the properties of G(n, d, p) to the nodal pattern of an eigenvector satisfying p_e(λ) = p 
(see eq. 27). 

Figure 11. The maximal deviation between the measured nodal count of 3 ≤ d ≤ 10 
regular graphs on 4000 vertices and the expected number of domains of size at most 
1, 2, 3 and 4 vertices. 

An important difference between the two models is their valency distribution. For 
percolation, as the deletion of different edges is independent, the probability of a vertex 
in G(n, d, p) to be of valency j is given by 



    p_j^{perc}(p) = \binom{d}{j}\, p^j (1 - p)^{d-j}    (30)



which is essentially different from p_j(λ) (28), the equivalent expression for the nodal 
pattern. As will be demonstrated soon, by changing the valency distribution, we change 
global properties of the pattern as well. 
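
For reference, the percolation valency law is just the binomial distribution with d trials and success probability p, since each of the d edges at a vertex is kept independently. A minimal sketch follows; the function name is an illustrative choice.

```python
from math import comb


def percolation_valency(d, p):
    """Binomial valency distribution for bond percolation on a d-regular graph.

    Entry j is the probability that exactly j of the d edges at a vertex survive
    when each edge is kept independently with probability p.
    """
    return [comb(d, j) * p ** j * (1 - p) ** (d - j) for j in range(d + 1)]


dist = percolation_valency(6, 0.5)
mean_valency = sum(j * pj for j, pj in enumerate(dist))
print([round(x, 4) for x in dist], "mean valency:", mean_valency)
```

Unlike the nodal pattern, this law depends on p alone and is always a binomial, whatever correlations the underlying eigenvector would impose between neighboring edges.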

The differences between the two processes are not limited to local measures such as p_j, 
but extend also to events which involve several vertices. For example, the probability to have 
a connected cluster of size k in G(n, d, p) is positive for any p > 0, and can be expressed 
through p_j^{perc}(p). As for the nodal pattern, for any k ∈ N there is some λ_max, so that for 
λ > λ_max the probability to have a domain smaller than k is zero.‡ In addition, as was 
demonstrated above, the quantities ν_k cannot be reduced to simple functions of p_j(λ). 

‡ This is so, as the probability to find a connected component in G of size k with more than 1 cycle 
goes to zero as n → ∞, allowing the use of bounds reminiscent of [25]. 

Finally, we would like to show that the critical threshold for the two models is different. 
In [33] the critical probability for G(n, d, p) was found to be p_c = 1/(d − 1). It is not 
hard to verify that for the nodal pattern p_e(λ_c) > 1/(d − 1). Consider for example the 
case d = 3, where p_c = 1/2 = p_e(λ = 0). As was mentioned before, for λ = 0 there are 
no interior points in the nodal domains; therefore, for d = 3 all nodal domains must 
be linear chains. However, percolation on linear chains is always subcritical, therefore 
necessarily λ_c > 0. By similar arguments, this will also be the case for d > 3. 
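
To make the comparison of thresholds concrete, one can compute the eigenvalue λ* at which the Gaussian edge probability equals the percolation threshold, p_e(λ*) = 1/(d − 1). Since p_e(λ) = 1/2 + arcsin(λ/d)/π is increasing in λ, the inequality p_e(λ_c) > 1/(d − 1) means that λ_c lies above λ*. The sketch below only evaluates λ*; it does not compute λ_c, and the function name is hypothetical.

```python
import math


def lambda_matching_percolation_threshold(d):
    """Eigenvalue lam* with p_e(lam*) = 1/(d-1), obtained by inverting
    p_e = 1/2 + arcsin(lam/d)/pi."""
    p_c = 1.0 / (d - 1)
    return d * math.sin(math.pi * (p_c - 0.5))


# Compare lam* with the spectral edge 2*sqrt(d-1) for several degrees.
for d in range(3, 9):
    lam_star = lambda_matching_percolation_threshold(d)
    edge = 2.0 * math.sqrt(d - 1)
    inside = abs(lam_star) < edge
    print(f"d={d}: lam*={lam_star:+.3f}, spectral edge={edge:.3f}, inside={inside}")
```

When λ* falls outside |λ| < 2√(d − 1), every eigenvector in the bulk of the spectrum has p_e above the percolation threshold, so an independent-edge model at the same density would be supercritical throughout.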
We would like to note that the mismatch between the properties of the Laplacian nodal 
pattern and G(n, d, p) should not come as a great surprise. One of the main arguments in 
favor of the percolation model [17] for two-dimensional billiards is that the asymptotic rate 
of decay of the eigenvectors' covariance (4) is fast enough to neglect correlations, 
according to the so-called 'Harris criterion' [34]. However, the covariance for eigenvectors 
of (n, d) graphs (13) does not fulfill the requirements of this criterion, implying that 
the scaling limit of the two models will differ. We should note that the requirements 
of this criterion are not fulfilled for billiards in more than two dimensions either. This 
suggests that the resemblance between the nodal pattern of Laplacian eigenvectors and 
percolation is a two-dimensional phenomenon. 

6. Conclusions 

As a summary, we collect the main new results of this work concerning the structure of 
Laplacian (Adjacency) eigenvectors: 

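(i) In the limit n → ∞, the empirical covariance (8) of an eigenvector of a uniformly 
chosen (n, d) graph, for a distance k < log_{d−1}(n)/2, is given by 

    C_\lambda(k) = \frac{1}{d\,(d-1)^{k/2}} \left( (d-1)\, U_k\!\left(\frac{\lambda}{2\sqrt{d-1}}\right) - U_{k-2}\!\left(\frac{\lambda}{2\sqrt{d-1}}\right) \right)    (31)

(with U_k the Chebyshev polynomials of the second kind), which decays exponentially with k. For k > log_{d−1}(n)/2, this approximation loses 
its accuracy; however, the observed rate of decay is still exponential in k. 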

(ii) We provide numerical evidence in support of the hypothesis that the distribution 
of the adjacency (or Laplacian) eigenvectors follows the Gaussian measure 

    p(f(U)) = \frac{1}{\sqrt{(2\pi)^{|U|}\, |C_{\lambda_0}|}} \exp\left[ -\frac{1}{2} \left( f(U),\, C_{\lambda_0}^{-1} f(U) \right) \right]    (32)

for any λ_0 ∈ [−2√(d − 1), 2√(d − 1)]. 

(iii) We used the Gaussian measure to predict the expected number of nodal domains 
in an eigenfunction and its dependence on the eigenvalue. We have shown the 
consistency of the Gaussian hypothesis with various nodal properties, such as 
valency and size distribution. 

(iv) We have shown that the nodal structure of the Laplacian eigenvectors differs from 
the cluster structure of G(n, d, p). The two models do not share the same critical 
point, and the structure of a typical component does not follow the same law in 
the two cases. 
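
The exponential decay stated in item (i) can be illustrated numerically. The sketch below assumes that the covariance at distance k ≥ 1 takes the Chebyshev form ((d − 1) U_k(x) − U_{k−2}(x)) / (d (d − 1)^{k/2}) with x = λ/(2√(d − 1)) and U_k the Chebyshev polynomials of the second kind. This form reproduces the entries λ/d and (λ² − d)/(d(d − 1)) used in section 5.1 for k = 1 and k = 2. The function names are illustrative.

```python
import math


def chebyshev_u(k, x):
    """Chebyshev polynomial of the second kind U_k(x), with U_{-1} = 0."""
    if k < 0:
        return 0.0
    prev, cur = 0.0, 1.0                    # U_{-1}, U_0
    for _ in range(k):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur


def covariance(lam, d, k):
    """Assumed covariance between two vertices at distance k (k = 0 gives 1)."""
    if k == 0:
        return 1.0
    x = lam / (2.0 * math.sqrt(d - 1))
    scale = d * (d - 1) ** (k / 2.0)
    return ((d - 1) * chebyshev_u(k, x) - chebyshev_u(k - 2, x)) / scale


lam, d = 1.3, 4
for k in range(0, 9):
    print(f"k={k}: covariance={covariance(lam, d, k):+.5f}")
```

Since |U_k(x)| ≤ k + 1 for |x| ≤ 1, the magnitude of this expression is at most (k + 1)(d − 1)^{−k/2} inside the bulk of the spectrum, which is the exponential decay referred to above.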